Improved search strategy for large vocabulary continuous Mandarin speech recognition

نویسندگان

  • Tai-Hsuan Ho
  • Kae-Cherng Yang
  • Kuo-Hsun Huang
  • Lin-Shan Lee
چکیده

This paper presents a new search strategy for large vocabulary continuous Mandarin speech recognition considering the special structure of Chinese language. This strategy is composed of a forward and a backward passes, between which a high-quality syllable lattice is generated to bridge the syllable-level and word-level decoding processes. In the forward pass, considering the small number of syllables in Chinese language, a framesynchronous stack decoder is used to integrate the high-order syllable N-Gram language model, so as to generate a very accurate and compact syllable lattice. In the backward pass, considering the special monosyllabic wording structure in Chinese language, the search space for the word-level decoding is expanded dynamically from the syllable lattice, and the best word sequence is extracted based on the knowledge provided by the word pronunciation lexicon and the word N-Gram language model. In the preliminary experiments, it was found that, with this strategy, the character error rate can be reduced by more than 20% as compared with a previous system using syllable-aligned lattice approach on a speaker-adaptive continuous speech recognition task.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition

This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. In this paper, we treat the tone as an attribute of the final vowel part of a Mandarin syllable. Separate distributions are estimated for cepstral coefficients and pitch features respectively, and the phonetic state tied-mixture techniqu...

متن کامل

Use of prosodic information to integrate acoustic and linguistic knowledge in continuous Mandarin speech recognition with very large vocabulary

This paper presents a new approach to use prosodic information for the integration of acoustic and linguistic knowledge in continuous Mandarin speech with very large vocabulary. Since the overhead computation incurred from unification of search space is confined to the syllable boundaries, the use of prosodic information to reduce the syllable boundary hypotheses as well as the syllable matchin...

متن کامل

Modeling Lexical Tones for Mandarin Large Vocabulary Continuous Speech Recognition

Modeling Lexical Tones for Mandarin Large Vocabulary Continuous Speech Recognition

متن کامل

A Recurrent Neural Network Based Finite State Machine for Fast Continuous Mandarin Speech Recognitionz

In this paper, a novel recurrent neural network (RNN) based nite state machine (FSM) front-end processor is proposed. It is integrated with the DP search into a whole for fast continuous Mandarin speech recognition. The FSM pre-classiies each speech frame into three stable states and a transient state. Diierent search spaces are then set in the following DP search for these four states in order...

متن کامل

Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data

This correspondence presents the first known results of complete recognition of continuous Mandarin speech for the Chinese language with very large vocabulary but very limited training data. Various acoustic and linguistic processing techniques were developed, and a prototype system of a continuous speech Mandarin dictation machine has been successfully implemented. The best recognition accurac...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998